
feat: max-length config #194

Merged: 11 commits merged on Apr 25, 2023
Conversation

@rocktimsaikia (Contributor) commented on Apr 9, 2023:

fixes #184 #91
closes #120

Setting the default to 50 characters.

This option is much needed, so I sped things up.
cc @privatenumber

Resolved review threads (outdated): src/utils/config.ts, src/utils/openai.ts, README.md
@rocktimsaikia (Contributor, Author) left a comment:

@privatenumber I made the changes. Please have a look.

```ts
}

parseAssert('length', /^\d+$/.test(length), 'Must be an integer');

const parsed = Number(length);
parseAssert('length', parsed >= 5, 'Must be greater than 5 tokens (20 characters)');
parseAssert('length', parsed >= 20, 'Must be greater than 20 characters');
```
@rocktimsaikia (Contributor, Author) commented:

This can change; you can decide the minimum limit. 20 seems alright to me.
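The validation above can be sketched in isolation. This is a hypothetical, self-contained version of the check being discussed: `parseAssert` here is a stand-in for the helper in src/utils/config.ts, and `parseMaxLength` is an illustrative wrapper name, not the actual function in the PR.

```typescript
// Stand-in for the parseAssert helper in src/utils/config.ts:
// throws when the condition fails, with a descriptive message.
const parseAssert = (name: string, condition: boolean, message: string): void => {
	if (!condition) {
		throw new Error(`Invalid config property ${name}: ${message}`);
	}
};

// Hypothetical wrapper showing the validation flow: the raw string must be
// all digits, and the parsed number must meet the minimum of 20 characters.
const parseMaxLength = (length: string): number => {
	parseAssert('max-length', /^\d+$/.test(length), 'Must be an integer');

	const parsed = Number(length);
	parseAssert('max-length', parsed >= 20, 'Must be greater than 20 characters');

	return parsed;
};
```

Running the string through a digits-only regex before `Number()` rejects values like `'1e3'` or `'-5'` that `Number()` would otherwise accept.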

```diff
@@ -100,18 +97,45 @@ const sanitizeMessage = (message: string) => message.trim().replace(/[\n\r]/g, '

 const deduplicateMessages = (array: string[]) => Array.from(new Set(array));

-const getPrompt = (locale: string, diff: string) => `Write a git commit message in present tense for the following diff without prefacing it with anything. Do not be needlessly verbose and make sure the answer is concise and to the point. The response must be in the language ${locale}:\n${diff}`;
+const getPrompt = (locale: string, diff: string, length: number) => `Write a git commit message in present tense for the following diff without prefacing it with anything. Do not be needlessly verbose and make sure the answer is concise and to the point. The response must be no longer than ${length} characters. The response must be in the language ${locale}:\n${diff}`;
```
@rocktimsaikia (Contributor, Author) commented:

Should we pad it by 5 in the prompt limit (${length}) too? Right now we are only padding it in the character-to-token conversion, but padding here might eventually result in 50+ characters in a commit message.

@rocktimsaikia (Contributor, Author) commented:

Friendly ping 👋 @privatenumber

Resolved review threads (outdated): README.md, src/utils/config.ts
```diff
@@ -1,7 +1,8 @@
 import https from 'https';
 import type { ClientRequest, IncomingMessage } from 'http';
 import type { CreateChatCompletionRequest, CreateChatCompletionResponse } from 'openai';
-import { type TiktokenModel } from '@dqbd/tiktoken';
+// eslint-disable-next-line camelcase
+import { TiktokenModel, encoding_for_model } from '@dqbd/tiktoken';
```
@privatenumber (Collaborator) commented:

Can you break each named import onto its own line so the eslint-disable directive applies specifically to encoding_for_model?

@rocktimsaikia (Contributor, Author) replied:

Doing that will result in an import/no-duplicates error.

@privatenumber (Collaborator) replied:

This is what I mean:

```ts
import {
	TiktokenModel,
	// eslint-disable-next-line camelcase
	encoding_for_model,
} from '@dqbd/tiktoken';
```

```ts
// Padded by 5 for more room for the completion.
const stringFromLength = generateStringFromLength(length + 5);
const tokenFromLength = getTokens(stringFromLength, model);
const tokenFromPrompt = getTokens(prompt, model);
```
@privatenumber (Collaborator) commented:

Is it possible to concatenate prompt + stringFromLength and pass it into getTokens once?

That way you can remove the getTokens util.
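The suggestion above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `countTokens` is a stand-in for the real tokenizer (the PR uses @dqbd/tiktoken's `encoding_for_model`), and its 4-characters-per-token heuristic exists only so the example is self-contained and runnable.

```typescript
// Stand-in tokenizer. The real implementation would call
// encoding_for_model(model).encode(text).length from @dqbd/tiktoken;
// the ~4 chars/token heuristic here is purely illustrative.
const countTokens = (text: string): number => Math.ceil(text.length / 4);

// Deterministic filler string of the requested length.
const generateStringFromLength = (length: number): string => 'a'.repeat(length);

// Concatenate the prompt and the padded filler, then tokenize once.
// This replaces two separate getTokens calls with a single call,
// so the getTokens util can be removed.
const getTotalTokens = (prompt: string, length: number): number =>
	countTokens(prompt + generateStringFromLength(length + 5)); // padded by 5
```

One caveat with this approach: tokenizers can merge characters across the concatenation boundary, so the single-call count may differ slightly from the sum of two separate counts, which is usually acceptable for a budget estimate.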

```ts
for (let i = 0; i < length; i += 1) {
	const randomIndex = Math.floor(Math.random() * characters.length);
	result += characters.charAt(randomIndex);
}
```
@privatenumber (Collaborator) commented:

What's the highest-token character? I think we can just use that instead of generating a random value every time.
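A sketch of that suggestion: repeat one fixed character instead of sampling randomly, which makes the token estimate deterministic. Note the choice of `'a'` below is a placeholder assumption; which character yields the most tokens per character depends on the actual tokenizer and would need to be measured.

```typescript
// Deterministic replacement for the random-string loop above: repeating a
// single character makes every call produce the same token count for a
// given length. 'a' is a placeholder; a worst-case (highest-token)
// character for the tokenizer in use would be the conservative choice.
const generateStringFromLength = (length: number): string => 'a'.repeat(length);
```

Beyond determinism, this also drops the per-character `Math.random()` and `charAt` work from the loop.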

@rocktimsaikia (Contributor, Author) commented:

@privatenumber The changes are done. Please have a look.

@rocktimsaikia (Contributor, Author) commented:

@privatenumber ICYMI

@privatenumber (Collaborator) commented:

Haven't forgotten (just backed up right now), but thanks for the reminder.

@privatenumber (Collaborator) commented:

Going to rename this to max-length

@rocktimsaikia (Contributor, Author) commented:

That's alright 👍

@privatenumber privatenumber changed the title feat: length config feat: max-length config Apr 25, 2023
@privatenumber privatenumber merged commit edce283 into Nutlope:develop Apr 25, 2023
@privatenumber (Collaborator) commented:

I think the test keeps failing because max-length sets a default length of 50, and that budget is shared across multiple results:
https://github.com/Nutlope/aicommits/actions/runs/4798973377/jobs/8552259028

Do you think you can open a PR to multiply it for the number of options to generate?

@rocktimsaikia (Contributor, Author) commented:

Okay, I will look into that.

@privatenumber (Collaborator) commented:

Actually, based on the docs I think n multiplies max_tokens:
https://platform.openai.com/docs/api-reference/completions/create#completions/create-prompt
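Concretely: in the OpenAI completions API, max_tokens limits each generated completion individually, and n controls how many completions are returned, so the token budget does not need to be multiplied by the number of options manually. A minimal sketch of the request shape, with illustrative values (`buildRequest` is a hypothetical helper, not code from this PR):

```typescript
// Hypothetical request builder. max_tokens bounds each completion
// individually; n returns that many independent choices, each subject
// to the same max_tokens cap, so no manual multiplication is needed.
const buildRequest = (prompt: string, maxTokens: number, completions: number) => ({
	model: 'gpt-3.5-turbo',
	messages: [{ role: 'user', content: prompt }],
	max_tokens: maxTokens,
	n: completions,
});
```

The caveat from the docs is billing rather than truncation: total tokens consumed can be up to n times max_tokens, even though no single choice exceeds the cap.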

Successfully merging this pull request may close these issues.

Setting maximum line length
2 participants